Add F16 precision toolkit (AVX2) + ARM NEON specialist agent#91

Merged
AdaWorldAPI merged 1 commit into master from claude/setup-rust-smart-home-SOPAY
Apr 13, 2026
Conversation

@AdaWorldAPI
Owner

simd_avx2.rs — 3 precision tricks, all AVX2-accelerated (additive only):

Trick 1: Double-f16 (Error-Free Split)
f16_double_encode/decode: store value as hi+lo f16 pair
~20-bit effective precision (vs 10-bit single f16)
f16_double_encode/decode_batch: AVX2 F16C + f32x8 addition
Error: ≤2^{-21} × |value| (vs ≤2^{-11} for single f16)
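
A minimal sketch of the hi+lo split described in Trick 1, written in plain scalar Rust so it runs without F16C. It is not the PR's code: `round_to_f16_precision`, `double_encode` and `double_decode` are hypothetical names, and f16 is only simulated by rounding an f32 significand to 11 bits. The f16 exponent limits (overflow above 65504, subnormals below about 6.1e-5) are not modelled, so the bound only illustrates the normal range.

```rust
/// Round an f32 to 11 significant bits (the precision of f16),
/// using round-to-nearest-even on the 13 discarded mantissa bits.
/// ASSUMPTION: only the significand is narrowed; f16 overflow and
/// subnormal behaviour are deliberately not modelled here.
fn round_to_f16_precision(x: f32) -> f32 {
    let bits = x.to_bits();
    let lsb = (bits >> 13) & 1; // lowest mantissa bit that survives
    let rounded = bits.wrapping_add(0x0FFF + lsb) & !0x1FFF;
    f32::from_bits(rounded)
}

/// Split `v` into a coarse part `hi` and a residual `lo`,
/// each holding at most 11 significant bits.
fn double_encode(v: f32) -> (f32, f32) {
    let hi = round_to_f16_precision(v);
    // The subtraction is exact in f32 because `hi` is `v` rounded.
    let lo = round_to_f16_precision(v - hi);
    (hi, lo)
}

/// Reconstruct the value by adding the two halves in f32.
fn double_decode(hi: f32, lo: f32) -> f32 {
    hi + lo
}
```

Calling `double_encode(3.1415927)` and then `double_decode` on the pair should land much closer to the input than `hi` alone, which is the effect the error bounds above describe.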

Trick 2: Kahan-compensated accumulation
f16_kahan_sum: error O(ε) + O(N·ε²) instead of O(N·ε), so effectively independent of count
f16_kahan_dot: AVX2 f32x8 multiply + Kahan-accumulate partial sums
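
The compensation step behind Trick 2 can be sketched in scalar Rust as follows. This is an illustration under assumptions, not the PR's `f16_kahan_sum`: the inputs here are already f32 (the PR presumably widens from f16 first), and `naive_sum` and `kahan_sum` are hypothetical names.

```rust
/// Plain left-to-right f32 summation, for comparison.
fn naive_sum(xs: &[f32]) -> f32 {
    let mut sum = 0.0_f32;
    for &x in xs {
        sum += x;
    }
    sum
}

/// Kahan (compensated) summation in f32.
/// `c` carries the low-order bits lost by the previous addition.
fn kahan_sum(xs: &[f32]) -> f32 {
    let mut sum = 0.0_f32;
    let mut c = 0.0_f32;
    for &x in xs {
        let y = x - c;       // apply the correction from the last step
        let t = sum + y;     // low bits of y may be lost here
        c = (t - sum) - y;   // recover what was lost (with opposite sign)
        sum = t;
    }
    sum
}
```

Summing one million copies of `0.1_f32` is a convenient way to see the difference: the naive loop drifts visibly, while the compensated loop stays within rounding distance of the exact total.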

Trick 3: Exponent-aligned scaling (F16Scaler)
from_range/from_data: auto-compute scale factor for value range
encode/decode_batch: AVX2 f32x8 scale + F16C convert
Up to ~128× precision improvement for narrow-range data
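
One plausible reading of the scale-factor computation in Trick 3 is sketched below. The PR does not show how `F16Scaler` derives its factor, so everything here is an assumption: the `Scaler` type, its methods, and the choice of a power-of-two factor that places the largest magnitude just under the f16 maximum of 65504. The intent is that small inputs move out of f16's subnormal range, where significand bits are lost.

```rust
/// Largest finite f16 value.
const F16_MAX: f32 = 65504.0;

/// HYPOTHETICAL stand-in for the PR's F16Scaler.
struct Scaler {
    scale: f32, // always a power of two, so scaling adds no rounding error
}

impl Scaler {
    /// Pick the largest power of two that keeps `max_abs * scale <= F16_MAX`.
    fn from_max_abs(max_abs: f32) -> Scaler {
        assert!(max_abs > 0.0 && max_abs.is_finite());
        let exp = (F16_MAX / max_abs).log2().floor() as i32;
        Scaler { scale: 2f32.powi(exp) }
    }

    /// Derive the factor from the largest magnitude in the data.
    fn from_data(xs: &[f32]) -> Scaler {
        let max_abs = xs.iter().fold(0.0_f32, |m, &x| m.max(x.abs()));
        Scaler::from_max_abs(max_abs)
    }

    /// Apply before converting to f16.
    fn scale_up(&self, x: f32) -> f32 {
        x * self.scale
    }

    /// Apply after converting back from f16.
    fn scale_down(&self, y: f32) -> f32 {
        y / self.scale
    }
}
```

For data such as `[0.001, -0.003, 0.002]`, this picks a factor that maps 0.003 into the top binade below 65504, and `scale_down(scale_up(x))` returns `x` up to f32 rounding.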

⚠️ NOT FOR GGUF CALIBRATION — BF16 pipeline is separate

.claude/agents/arm-neon-specialist.md:
Complete ARM SBC knowledge: Pi Zero 2W through Pi 5, Orange Pi 3-5
Per-CPU microarchitecture (A53/A72/A76 pipeline differences)
big.LITTLE awareness (RK3399, RK3588)
F16 inline asm trick, codebook strategy per tier, memory budgets

6 new tests passing. No existing code modified.

https://claude.ai/code/session_017ZN5PNEf8boFBgorUZVrFU

@AdaWorldAPI AdaWorldAPI merged commit b073060 into master Apr 13, 2026
4 of 10 checks passed